Runtime Compression of MPI Messages to Improve the Performance and Scalability of Parallel Applications

Authors

  • Jian Ke
  • Martin Burtscher
  • Evan Speight
Abstract

Communication-intensive parallel applications spend a significant amount of their total execution time exchanging data between processes, which leads to poor performance in many cases. In this paper, we investigate message compression in the context of large-scale parallel message-passing systems to reduce the communication time of individual messages and to improve the bandwidth of the overall system. We implement and evaluate the cMPI message-passing library, which quickly compresses messages on-the-fly with a low enough overhead that a net execution time reduction can be obtained. Our results on six large-scale benchmark applications show that execution speed improves by up to 98% when message compression is enabled.
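The idea of compressing a message only when doing so pays off can be illustrated with a minimal sketch. It is not the cMPI implementation: the abstract does not describe cMPI's compression algorithm or wire format, so `zlib` is used here purely as a stand-in, and the one-byte header flag and the function names `pack_message` and `unpack_message` are hypothetical.

```python
import zlib

# Hypothetical one-byte header telling the receiver how to decode the body.
RAW = b"\x00"
COMPRESSED = b"\x01"


def pack_message(payload: bytes, level: int = 1) -> bytes:
    """Return the bytes to put on the wire for `payload`.

    A fast (low) compression level is tried first; the compressed form is
    used only if it is actually smaller, so incompressible data is sent raw.
    """
    candidate = zlib.compress(payload, level)
    if len(candidate) < len(payload):
        return COMPRESSED + candidate
    return RAW + payload


def unpack_message(wire: bytes) -> bytes:
    """Invert pack_message on the receiving side."""
    flag, body = wire[:1], wire[1:]
    if flag == COMPRESSED:
        return zlib.decompress(body)
    if flag == RAW:
        return body
    raise ValueError("unknown header flag")


if __name__ == "__main__":
    regular = bytes(1000)  # 1000 zero bytes: highly compressible
    wire = pack_message(regular)
    print(len(regular), "bytes before,", len(wire), "bytes on the wire")
    assert unpack_message(wire) == regular
```

In a real message-passing library the sender would hand the packed bytes to the transport and the receiver would unpack them before delivery. Whether there is a net gain depends on the compression cost relative to the network time saved, which is the trade-off the abstract refers to.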


Similar resources

Topic 13: High Performance Network and Communication

This topic on High-Performance Network and Communication is devoted to communication issues in scalable compute and storage systems, such as parallel computers, networks of workstations, and clusters. All aspects of communication in modern systems were solicited, including advances in the design, implementation, and evaluation of interconnection networks, network interfaces, system and storage ...


Performance Modeling of Gyrokinetic Toroidal Simulations for a Many-Tasking Runtime System

Conventional programming practices on multicore processors in high performance computing architectures are not universally effective in terms of efficiency and scalability for many algorithms in scientific computing. One possible solution for improving efficiency and scalability in applications on this class of machines is the use of a many-tasking runtime system employing many lightweight, con...


Reducing Communication Time through Message Prefetching

The latency of large messages often leads to poor performance of parallel applications. In this paper, we investigate a novel latency reduction technique where message receivers prefetch messages from senders before the matching sends are called. When the send is finally called, only the parts of the message that have changed since the prefetch need to be transmitted, resulting in a smaller mes...


Constructing Resiliant Communication Infrastructure for Runtime Environments

Next generation HPC platforms are expected to feature millions of cores distributed over hundreds of thousands of nodes, leading to scalability and fault-tolerance issues for both applications and runtime environments dedicated to run on such machines. Most parallel applications are developed using a communication API such as MPI, implemented in a library that runs on top of a dedicated runtime...


Performance of Multicore Systems on Parallel Datamining Services

Multicore systems are of growing importance and 64-128 cores can be expected in a few years. We expect datamining to be an important application class of general importance and are developing such scalable parallel algorithms for managed code (C#) on Windows. We present a performance analysis that compares MPI and a new messaging runtime library CCR (Concurrency and Coordination Runtime) with Wi...




Publication date: 2004